Search results for "Hardware acceleration"
showing 9 items of 9 documents
Fast prototyping of a SoC-based smart-camera: a real-time fall detection case study
2014
International audience; Smart camera, i.e. cameras that are able to acquire and process images in real-time, is a typical example of the new embedded computer vision systems. A key example of application is automatic fall detection, which can be useful for helping elderly people in daily life. In this paper, we propose a methodology for development and fast-prototyping of a fall detection system based on such a smart camera, which allows to reduce the development time compared to standard approaches. Founded on a supervised classification approach, we propose a HW/SW implementation to detect falls in a home environment using a single camera and an optimized descriptor adapted to real-time t…
A novel hardware accelerator for the HEVC intra prediction
2015
International audience; A novel hardware accelerator for the High Efficiency Video Coding (HEVC) intra prediction is presented in this paper in order to reduce the computation complexity within this standard and to accelerate the concerned calculations. We propose a new pipelined structure that we called Processing Element (PE) to execute all angular modes, and we repeat it in five paths that our architecture composed of. We present also another structure to carry out the Planar mode. This architecture supports all intra prediction modes for all prediction unit sizes. The synthesis results show that our design can run at 213 MHz for Xilinx Virtex 6 and is capable to process real time 120 10…
Fully pipelined real time hardware solution for High Efficiency Video Coding (HEVC) intra prediction
2016
International audience; A fully pipelined hardware accelerator for the High Efficiency Video Coding (HEVC) intra prediction is presented in this paper in order to reduce the computation complexity coming with this module and to accelerate the concerned calculations. Two reconfigurable structures are developed in this paper, the first one concerns angular modes and is identified as Processing Element for Angular (PEA) modes, the other is made in order to handle with the Planar mode and is identified as Processing Element for the Planar (PEP) mode. Each structure is repeated in five paths, that our architecture composed of, working in parallel way. This architecture supports all intra predict…
Architecture-Driven Level Set Optimization: From Clustering to Sub-pixel Image Segmentation
2016
Thanks to their effectiveness, active contour models (ACMs) are of great interest for computer vision scientists. The level set methods (LSMs) refer to the class of geometric active contours. Comparing with the other ACMs, in addition to subpixel accuracy, it has the intrinsic ability to automatically handle topological changes. Nevertheless, the LSMs are computationally expensive. A solution for their time consumption problem can be hardware acceleration using some massively parallel devices such as graphics processing units (GPUs). But the question is: which accuracy can we reach while still maintaining an adequate algorithm to massively parallel architecture? In this paper, we attempt to…
Efficient smart-camera accelerator: A configurable motion estimator dedicated to video codec
2013
Smart cameras are used in a large range of applications. Usually the smart cameras transmit the video or/and extracted information from the video scene, frequently on compressed format to fit with the application requirements. An efficient hardware accelerator that can be adapted and provide the required coding performances according to the events detected in the video, the available network bandwidth or user requirements, is therefore a key element for smart camera solutions. We propose in this paper to focus on a key part of the compression system: motion estimation. We have developed a flexible hardware implementation of the motion estimator based on FPGA component, fully compatible with…
Feasibility of FPGA accelerated IPsec on cloud
2018
Abstract Hardware acceleration for famous VPN solution, IPsec, has been widely researched already. Still it is not fully covered and the increasing latency, throughput, and feature requirements need further evaluation. We propose an IPsec accelerator architecture in an FPGA and explain the details that need to be considered for a production ready design. This research considers the IPsec packet processing without IKE to be offloaded on an FPGA in an SDN network. Related work performance rates in 64 byte packet size for throughput is 1–2 Gbps with 0.2 ms latency in software, and 1–4 Gbps with unknown latencies for hardware solutions. Our proposed architecture is capable to host 1000 concurre…
PNeuro: A scalable energy-efficient programmable hardware accelerator for neural networks
2018
Proceedings of a meeting held 19-23 March 2018, Dresden, Germany; International audience; Artificial intelligence and especially Machine Learning recently gained a lot of interest from the industry. Indeed, new generation of neural networks built with a large number of successive computing layers enables a large amount of new applications and services implemented from smart sensors to data centers. These Deep Neural Networks (DNN) can interpret signals to recognize objects or situations to drive decision processes. However, their integration into embedded systems remains challenging due to their high computing needs. This paper presents PNeuro, a scalable energy-efficient hardware accelerat…
The Dynamical Kernel Scheduler - Part 1
2015
Emerging processor architectures such as GPUs and Intel MICs provide a huge performance potential for high performance computing. However developing software using these hardware accelerators introduces additional challenges for the developer such as exposing additional parallelism, dealing with different hardware designs and using multiple development frameworks in order to use devices from different vendors. The Dynamic Kernel Scheduler (DKS) is being developed in order to provide a software layer between host application and different hardware accelerators. DKS handles the communication between the host and device, schedules task execution, and provides a library of built-in algorithms. …
A Mobile Computing Framework for Pervasive Adaptive Platforms
2012
International audience; Ubiquitous computing is now the new computing trend, such systems that interact with their environment require self-adaptability. Bioinspiration is a natural candidate to provide the capability to handle complex and changing scenarios. This paper presents a programming framework dedicated to pervasive platforms programming. This bioinspired and agentoriented framework has been developed within the frame of the PERPLEXUS European project that is intended to provide support for bioinspiration-driven system adaptability. This framework enables the platform to adapt itself to application requirements at high-level while using hardware acceleration at node level. The resu…